Causal relationships between lung cancer and sepsis: a genetic correlation and multivariate mendelian randomization analysis

Background Previous research has emphasized a correlation between lung cancer (LC) and sepsis, but the causal link remains unclear. Method This study used univariate Mendelian randomization (MR) to explore the causal relationship between LC, its subtypes, and sepsis. Linkage disequilibrium score (LDSC) regression was used to calculate genetic correlations. Multivariate MR was applied to investigate the role of seven confounding factors. The primary method was inverse-variance-weighted (IVW), supplemented by sensitivity analyses to assess directionality, heterogeneity, and result robustness. Results LDSC analysis revealed a significant genetic correlation between LC and sepsis (genetic correlation = 0.325, p = 0.014). Following false discovery rate (FDR) correction, strong evidence suggested that genetically predicted LC (OR = 1.172, 95% CI 1.083–1.269, p = 8.29 × 10⁻⁵, P_FDR = 2.49 × 10⁻⁴), squamous cell lung carcinoma (OR = 1.098, 95% CI 1.021–1.181, p = 0.012, P_FDR = 0.012), and lung adenocarcinoma (OR = 1.098, 95% CI 1.024–1.178, p = 0.009, P_FDR = 0.012) are linked to an increased incidence of sepsis. Suggestive evidence was also found for small cell lung carcinoma (Wald ratio: OR = 1.156, 95% CI 1.047–1.277, p = 0.004) in relation to sepsis. The multivariate MR suggested that part of the effect of all LC subtypes on sepsis might be mediated through body mass index. Reverse analysis did not find a causal relationship (p > 0.05 and P_FDR > 0.05). Conclusion The study suggests a causal link between LC and increased sepsis risk, underscoring the need for integrated sepsis management in LC patients.


Introduction
Sepsis is described as "a life-threatening organ dysfunction caused by dysregulated host systemic inflammatory and immune response to infection" (Singer et al., 2016). Sepsis remains a serious global health challenge (Tiru et al., 2015; Zhu et al., 2022). It affected close to 50 million individuals globally and accounted for about 20% of global deaths before the COVID-19 pandemic (Rudd et al., 2020). Key factors contributing to the development of sepsis include the pathogen's virulence, the site and type of infection, and host factors such as age, genetic predisposition, and comorbidities (Mao et al., 2013; Pan et al., 2017; Moon et al., 2023). Despite advancements in understanding its mechanisms, sepsis remains a significant challenge in healthcare due to its rapid progression and high mortality rate (Bauer et al., 2020). Early recognition and prompt management are crucial in improving patient outcomes.
Lung cancer (LC) is responsible for 2.2 million new cases each year, ranking as the world's second most prevalent cancer. Furthermore, it is the primary cause of cancer-related mortality, resulting in approximately 1.79 million deaths annually (Sung et al., 2021; Thai et al., 2021; Huang et al., 2022). From a pathological classification perspective, LC can be broadly divided into two types: non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). NSCLC primarily includes histological subtypes such as adenocarcinoma (LUAD) and squamous cell carcinoma (SqCLC). The relationship between LC and sepsis is complex (Pavon et al., 2013; Hensley et al., 2019; Mirouse et al., 2020; Moore et al., 2020; Shvetsov et al., 2021; Xia et al., 2022). Most research evidence supports the idea that LC is a risk factor for sepsis (Pavon et al., 2013; Hensley et al., 2019; Moore et al., 2020; Xia et al., 2022). The immunosuppressive state induced by the tumor itself or by cancer treatments can make these patients more susceptible to infections (Pavon et al., 2013; Hensley et al., 2019; Xia et al., 2022). A prospective study of sepsis risk following cancer showed a heightened risk of sepsis in cancer survivors; the cancers studied included lung cancer, breast cancer, prostate cancer, and other solid tumors, as well as hematological tumors (Moore et al., 2020). Additionally, studies have shown that the incidence and mortality rates of sepsis are higher among cancer patients; severe sepsis is associated with 8.5% of all cancer deaths, costing 3.4 billion dollars per year (Williams et al., 2004). However, some study results differ from this view; for example, one study suggests both complementary and antagonistic relationships between cancer and sepsis (Mirouse et al., 2020). Yurii B. Shvetsov and colleagues, in a multiethnic cohort study concerning the association between sepsis mortality and specific cancer sites and treatment types, found that lung cancer was associated with a significantly lower increase in sepsis mortality compared to non-sepsis mortality (Shvetsov et al., 2021).
Conversely, there is currently no consensus on whether sepsis increases the risk of cancer incidence. A multicenter observational study suggests that the incidence of sepsis does not alter the oncological and prognostic results in patients with epithelial ovarian cancer (Said et al., 2023). However, another study reported that sepsis was significantly linked to a higher risk of nine types of cancer, including LC, within 5 years after sepsis diagnosis (Liu Z. et al., 2019).
Given these inconsistent findings and the constraints of observational research in establishing cause-and-effect relationships, Mendelian randomization (MR) can provide insights into causality that observational studies lack. MR uses genetic variants, derived from genome-wide association studies, as instrumental variables (IVs) for causal inference. MR functions like a natural randomized controlled trial (RCT), offering more substantial evidence and less vulnerability to confounding factors than observational studies. MR is extensively used in cancer and disease research (Li and Wang, 2023; Xu et al., 2023). Hence, performing a bidirectional MR study could be critical in deciphering causal links between sepsis and LC, paving the way for better prevention and therapies.

Study design
This study explores the causal relationship between LC and sepsis using summary-level data from the largest publicly accessible genome-wide association studies (GWAS) currently available on these conditions. The analyses comprised bidirectional univariate MR, complementary multivariable MR (MVMR), and genetic correlation evaluation. Valid IVs must satisfy three assumptions: (i) the genetic instrument is strongly associated with the exposure; (ii) the instrument is independent of confounding variables; (iii) the genetic variants affect the outcome exclusively through the exposure (Lawlor et al., 2008). The MR framework is presented in Figure 1, and the data sources are summarized in Table 1. This study is reported following the Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization guidelines (STROBE-MR) (Skrivankova et al., 2021).

Selection of genetic instrumental variables
The MR analysis applied the following selection criteria for single nucleotide polymorphisms (SNPs): (i) SNPs serving as instrumental variables showed genome-wide significant associations with the exposure (p < 5 × 10⁻⁸). In the reverse analysis, because no SNPs reached genome-wide significance for the sepsis phenotype, we adopted a more relaxed threshold (p < 5 × 10⁻⁶), following previous MR analyses, to obtain a sufficient number of SNPs (Liu et al., 2024; Xu et al., 2024; Yang et al., 2024). (ii) SNPs were screened to exclude associations with confounding variables and to confirm independence, thus preventing bias due to linkage disequilibrium (r² < 0.001, clumping distance = 10,000 kb). (iii) SNP validity as instrumental variables was gauged by the F-statistic, F = R² × (N − 2)/(1 − R²), where R² denotes the proportion of variance in the exposure explained by the SNPs and N is the sample size of the GWAS from which the exposure is drawn. Only SNPs with an F-statistic > 10 were retained, which eliminated weak instruments (Teumer, 2018). (iv) MR-Steiger filtering was applied to remove variants demonstrating stronger associations with outcomes than with exposures (Hemani et al., 2017). (v) When a SNP was unavailable in the outcome dataset, the SNiPA web interface (http://snipa.helmholtz-muenchen.de/snipa3/) was used, with genotype data from the European cohort of the 1000 Genomes Project Phase 3, to locate a proxy SNP in strong linkage disequilibrium with the primary SNP (r² > 0.8). (vi) The SNP's effects on exposure and outcome were required to be aligned to the same allele.
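The weak-instrument filter in criterion (iii) can be illustrated with a minimal Python sketch. The SNP records, identifiers, and R² values below are hypothetical and chosen only for illustration; the sample size is the LC case and control count reported for the exposure GWAS, and the function names are assumptions rather than part of any published pipeline.

```python
def f_statistic(r2: float, n: int) -> float:
    """F = R^2 * (N - 2) / (1 - R^2), the form quoted in criterion (iii)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 * (n - 2) / (1.0 - r2)


def keep_strong_instruments(snps, n, threshold=10.0):
    """Return the SNP records whose F-statistic exceeds the threshold."""
    return [s for s in snps if f_statistic(s["r2"], n) > threshold]


# Hypothetical SNP records: identifiers and R^2 values are invented.
candidates = [
    {"snp": "rs_example_1", "r2": 0.0010},
    {"snp": "rs_example_2", "r2": 0.00005},
]

# 29,266 LC cases + 56,450 controls, as reported for the exposure GWAS.
N_EXPOSURE = 29_266 + 56_450

strong = keep_strong_instruments(candidates, N_EXPOSURE)
print([s["snp"] for s in strong])
```

With these invented values only the first record passes the F > 10 threshold, which shows how a SNP explaining very little exposure variance is discarded even in a large sample.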

Source of lung cancer phenotype
This study utilizes the most extensive dataset to date, drawn from a meta-analysis by McKay et al., encompassing European ancestry GWAS for LC with 29,266 cases and 56,450 controls, SqCLC with 7,426 cases and 55,627 controls, LUAD with 11,273 cases and 55,483 controls, and SCLC with 2,664 cases and 21,444 controls (McKay et al., 2017). That meta-analysis integrates novel data from the OncoArray genotyping platform with existing data from previous LC GWAS, conducting a large-scale association analysis on over 29,000 patients and 56,000 controls of European descent.

Source of sepsis phenotype
The latest and most comprehensive GWAS summary data on sepsis are derived from the UK Biobank (Sudlow et al., 2015). The GWAS was adjusted for age, sex, ten genetic principal components, and genotyping batch. It comprised 11,643 sepsis cases and 474,841 controls, all of European descent. Cases were identified by the presence of ICD-10 codes A02, A39, A40, and A41.

MR analysis
The univariate MR framework evaluated individual IVs using the Wald ratio, which estimates the causal effect by dividing the SNP-outcome association (β_Y) by the SNP-exposure association (β_X). This method provides a causal estimate for each genetic variant, assuming that the IVs are strongly associated with the exposure, are independent of confounders, and affect the outcome solely through the exposure (Burgess et al., 2013). To estimate causal associations involving multiple IVs (two or more), we used the multiplicative random-effects inverse-variance-weighted (IVW) method (Burgess et al., 2013). When the heterogeneity index I² is below 50%, outcomes derived from the fixed-effects model are considered robust. The analysis was supplemented by the MR-Egger and weighted median methods. The IVW method combines the Wald ratio estimates for each SNP, weighting each estimate by the inverse of its variance (Hemani et al., 2018). By integrating the full set of genetic variants, the IVW approach yields a single pooled estimate. In contrast, the weighted median method remains consistent provided that at least half of the weight comes from valid instruments, whereas the MR-Egger method can accommodate the case in which all variants are invalid, provided that the pleiotropic effects are independent of instrument strength (Bowden et al., 2016). Furthermore, the constrained maximum likelihood (CML) method was employed, allowing for collective analysis over an expansive array of genetic variants while adjusting for possible confounders and intrinsic genetic heterogeneity. Particularly when addressing a comprehensive array of genetic variants and confounders, the CML method is useful for obtaining accurate and robust results (Zhang et al., 2008).
FIGURE 2
Scatterplot summary of all MR analyses. The vertical and horizontal lines denote the 95% confidence intervals for the effect size, while the slopes of the fitted lines indicate the estimated Mendelian randomization effect per method. (A) Lung adenocarcinoma on sepsis (B) Lung cancer on sepsis (C) Squamous cell lung carcinoma on sepsis (D) Sepsis on lung adenocarcinoma (E) Sepsis on small cell lung carcinoma (F) Sepsis on lung cancer (G) Sepsis on squamous cell lung carcinoma. MR, Mendelian randomization; SNP, single nucleotide polymorphism.
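The Wald ratio and IVW estimators described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the input effect sizes are invented, the Wald standard error uses a first-order approximation, and flooring the overdispersion factor at 1 is one common convention for the multiplicative random-effects model (software packages differ on this point).

```python
import math


def wald_ratio(beta_x, beta_y, se_y):
    """Single-SNP causal estimate with a first-order standard error."""
    return beta_y / beta_x, se_y / abs(beta_x)


def ivw(beta_x, beta_y, se_y):
    """IVW estimate with a multiplicative random-effects standard error.

    Returns (estimate, se, cochran_q, i_squared).
    """
    k = len(beta_x)
    if k < 2:
        raise ValueError("IVW needs at least two instruments")
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    # Inverse variance of each Wald ratio: (beta_x / se_y)^2
    weights = [(bx / s) ** 2 for bx, s in zip(beta_x, se_y)]
    total_w = sum(weights)
    est = sum(w * r for w, r in zip(weights, ratios)) / total_w
    se_fixed = math.sqrt(1.0 / total_w)
    # Cochran's Q measures heterogeneity between the per-SNP estimates
    q = sum(w * (r - est) ** 2 for w, r in zip(weights, ratios))
    phi = max(1.0, q / (k - 1))  # assumed convention: never shrink the SE
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return est, se_fixed * math.sqrt(phi), q, i2


def odds_ratio(est, se, z=1.96):
    """Convert a log-odds estimate into an OR with a 95% interval."""
    return math.exp(est), math.exp(est - z * se), math.exp(est + z * se)


# Invented summary statistics for three hypothetical SNPs.
bx = [0.10, 0.20, 0.05]
by = [0.02, 0.04, 0.01]
sy = [0.01, 0.01, 0.01]

estimate, se, q_stat, i_sq = ivw(bx, by, sy)
print(odds_ratio(estimate, se))
```

Because every invented SNP has the same ratio, Cochran's Q is essentially zero and the random-effects standard error coincides with the fixed-effects one; heterogeneous ratios would inflate it.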
Additional MVMR analyses were carried out to estimate the direct causal effect of exposure on outcome (Burgess and Thompson, 2015). These analyses aimed to define the direct causal connections precisely, thereby differentiating them from the univariate MR (UVMR) model. In contrast to UVMR, which concentrates on a single exposure, MVMR considers genetic variants linked to multiple exposures. The initial stage involved generating MR effect estimates for the exposure-to-outcome relationships using the IVW method. Subsequently, an MVMR assessment was carried out to assess the impact of six mediators on the outcome, considering the specific attributes of the exposure.
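As a rough illustration of how MVMR separates direct effects, the multivariable IVW estimator can be written as a weighted regression of SNP-outcome effects on the SNP-exposure effects of several exposures, with no intercept. The sketch below handles two exposures only and uses synthetic data; the exposure labels, true effects, and function name are assumptions for illustration and do not reproduce the study's models.

```python
def mvmr_ivw_two_exposures(beta_x1, beta_x2, beta_y, se_y):
    """Weighted least squares through the origin with two exposures.

    Solves the 2x2 normal equations directly and returns the direct
    effects (theta1, theta2) of exposure 1 and exposure 2 on the outcome.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, beta_x1))
    s22 = sum(wi * b * b for wi, b in zip(w, beta_x2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, beta_x1, beta_x2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, beta_x1, beta_y))
    t2 = sum(wi * b * y for wi, b, y in zip(w, beta_x2, beta_y))
    det = s11 * s22 - s12 ** 2
    if abs(det) < 1e-12:
        raise ValueError("exposure effects are collinear")
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2


# Synthetic SNP effects on two hypothetical exposures (e.g. LC and BMI).
bx_lc = [0.10, 0.05, 0.08, 0.02]
bx_bmi = [0.01, 0.06, 0.03, 0.09]
# Outcome effects generated from assumed direct effects 0.15 and 0.30.
by_sepsis = [0.15 * a + 0.30 * b for a, b in zip(bx_lc, bx_bmi)]
se_sepsis = [0.01, 0.01, 0.01, 0.01]

theta_lc, theta_bmi = mvmr_ivw_two_exposures(bx_lc, bx_bmi, by_sepsis, se_sepsis)
print(theta_lc, theta_bmi)
```

Since the synthetic outcome effects are an exact linear combination of the two exposure columns, the estimator recovers the assumed direct effects; with real data the estimate for the primary exposure would shrink toward zero if part of its effect ran through the second exposure.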

LDSC regression analysis
The linkage disequilibrium score (LDSC) regression, designed for analyzing GWAS summary data, is an effective instrument for discerning genetic correlations among complex diseases or traits. This method facilitates the separation of authentic polygenic influences from confounding factors, which include subtle family structures and population stratification (Bulik-Sullivan et al., 2015). A significant genetic correlation, characterized by statistical solidity and substantial effect size, implies that the correlation between phenotypes is not merely attributable to environmental influences. The LDSC software, available at https://github.com/bulik/ldsc, provides a direct path for investigating the genetic foundations linking exposure and outcome traits.
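For orientation, the cross-trait LD score regression model is usually written as below. The notation is not taken from this article and is introduced here as an assumption: z_{1j} and z_{2j} are the association z-scores of SNP j for the two traits, N_1 and N_2 the GWAS sample sizes, N_s the number of overlapping samples, ρ the phenotypic correlation among overlapping samples, ℓ_j the LD score of SNP j, M the number of SNPs, ρ_g the genetic covariance, and h_1², h_2² the SNP heritabilities.

```latex
\mathbb{E}\left[ z_{1j}\, z_{2j} \right]
  = \frac{\sqrt{N_1 N_2}\;\rho_g}{M}\,\ell_j
  + \frac{\rho\, N_s}{\sqrt{N_1 N_2}},
\qquad
r_g = \frac{\rho_g}{\sqrt{h_1^2\, h_2^2}}
```

The slope carries the genetic covariance, while sample overlap affects only the intercept, which is why the genetic correlation estimate is not biased by shared participants.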

Sensitivity analysis
Heterogeneity among the chosen genetic variants was measured using Cochran's Q test, with a p-value less than 0.05 denoting notable heterogeneity among the SNPs under study (Kulinskaya et al., 2020). MR-Egger regression was utilized to investigate directional pleiotropy within the MR context (Burgess and Thompson, 2017). An MR-Egger intercept with a p-value below 0.05 indicates notable directional pleiotropy, despite the acknowledged limitations of this method (Wu et al., 2020). The MR Pleiotropy Residual Sum and Outlier (MR-PRESSO) approach was used to pinpoint outliers and evaluate horizontal pleiotropy, with a global p-value below 0.05 confirming its presence (Verbanck et al., 2018). Outliers were removed to refine analytical accuracy. This was followed by a leave-one-out sensitivity analysis to appraise the effect of single SNPs on the collective outcomes (Cheng et al., 2017). The false discovery rate (FDR) method was employed to correct for multiple comparisons. Post-correction, p-values less than 0.05 denoted significant causal associations. Conversely, outcomes with raw p-values under 0.05 that did not maintain this significance after FDR adjustment were classified as suggestive rather than definitive.
Table. Summary of results from MR analysis of genetically predicted lung cancer phenotypes for sepsis. CML, constrained maximum likelihood; FDR, false discovery rate; IVW, inverse-variance-weighted; LC, lung cancer; LUAD, lung adenocarcinoma; MR, Mendelian randomization; MR-PRESSO, MR Pleiotropy Residual Sum and Outlier; OR, odds ratio; P-val, p-value; SCLC, small cell lung carcinoma; SqCLC, squamous cell lung carcinoma; SNP, single nucleotide polymorphism.
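The FDR step can be illustrated with a short Python sketch of the Benjamini-Hochberg procedure. The article does not name the specific FDR procedure or state which p-values were corrected together, so both are assumptions here: the sketch treats the three IVW p-values reported in the abstract (LC, SqCLC, LUAD) as one family. The function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(p, p_fdr, alpha=0.05):
    """Label a result using the significant / suggestive rule in the text."""
    if p_fdr < alpha:
        return "significant"
    if p < alpha:
        return "suggestive"
    return "not significant"


# Raw IVW p-values as reported in the abstract.
raw = {"LC": 8.29e-5, "SqCLC": 0.012, "LUAD": 0.009}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
labels = {name: classify(raw[name], adj[name]) for name in raw}
print(adj, labels)
```

Under the stated assumption, the adjusted values agree with the FDR-corrected p-values given in the abstract (approximately 2.49 × 10⁻⁴, 0.012, and 0.012), and all three associations remain below 0.05.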
R² values were calculated as R² = 2 × MAF × (1 − MAF) × β², where MAF represents the minor allele frequency for each SNP and β its effect estimate on the exposure. These calculated R² values were then aggregated to establish the combined parameter for power calculation (Guan et al., 2014). The mRnd platform (Brion et al., 2013) (https://shiny.cnsgenomics.com/mRnd/) provided the means for statistical power assessment.
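The per-SNP variance calculation can be sketched as follows. The MAF and effect values are invented, and summing the per-SNP values is an assumption that is reasonable only for independent (clumped) SNPs; the article says only that the values were combined.

```python
def snp_r2(maf: float, beta: float) -> float:
    """Variance explained by one SNP: 2 * MAF * (1 - MAF) * beta^2."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("minor allele frequency must lie in (0, 0.5]")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def total_r2(snps) -> float:
    """Sum per-SNP R^2 over independent instruments (assumed additive)."""
    return sum(snp_r2(maf, beta) for maf, beta in snps)


# Invented (MAF, beta) pairs for two hypothetical instruments.
instruments = [(0.30, 0.10), (0.10, 0.20)]
combined = total_r2(instruments)
print(combined)
```

The combined value is the quantity that would be entered, together with the sample size and case proportion, into a power calculator such as mRnd.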

Genetic instrument selection and genetic correlation between phenotypes
The F-statistics for all instrumental variables exceeded 500, indicating a low risk of weak instrument bias. The number of SNPs selected as instrumental variables ranged from one to 14, with the variance explained ranging from 1.75% to 12.36% (Supplementary Table S1). The scatter plot (Figure 2) provides an intuitive representation of the direction of causal associations, while the forest plot (Supplementary Figure S1) displays the effects contributed by all IVs. Detailed SNP information can be found in Supplementary Tables S2-S8.
In the sensitivity analysis (Table 2), all IVs passed the MR-Steiger filtering, and Cochran's Q statistic indicated no significant heterogeneity (p > 0.05). Likewise, MR-Egger and MR-PRESSO tests revealed no pleiotropy (p > 0.05). Leave-one-out analysis confirmed that the causal inference was not driven by any single SNP (Supplementary Figure S2), and the funnel plot exhibited a symmetrical distribution (Supplementary Figure S3).

Discussion
This research undertook a comprehensive MR analysis to explore the link between genetic predisposition to LC and sepsis. The MR results support earlier epidemiological research (Pavon et al., 2013; Rhee et al., 2017; Xia et al., 2022), indicating a causal effect of LC on sepsis. In the reverse direction, no causal effect of sepsis on LC was found. Additional MVMR analysis suggested that factors such as body mass index, level of education, type 2 diabetes, and cigarettes smoked per day might play a role in mediating part of this causal link.
Prior studies have indicated a link between sepsis and LC, with results showing a correlation between LC and an increased likelihood of sepsis (Pavon et al., 2013; Hensley et al., 2019; Xia et al., 2022). The immunosuppressive state induced by the tumor itself or by cancer treatments can make these patients more susceptible to infections (Pavon et al., 2013). In a group of more than one million hospital admissions for sepsis in the U.S., over 20% had a connection to cancer (Hensley et al., 2019). A prospective study of sepsis risk after cancer showed an increased risk of sepsis in cancer survivors (Moore et al., 2020). Nonetheless, previous research has yielded inconsistent results concerning the relationship between sepsis in general and LC. Notably, one study demonstrated that the relationship between cancer and sepsis is both complementary and antagonistic (Mirouse et al., 2020). Our MR results are consistent with most previous studies and support the idea that LC contributes to sepsis.
Conversely, there is currently no consensus on whether sepsis increases the risk of cancer incidence. A multicenter observational study suggests that the occurrence of sepsis does not affect the oncological and survival outcomes in patients with epithelial ovarian cancer (Said et al., 2023). However, another study reported that sepsis was notably linked with a heightened risk of nine different cancer types, including LC, within 5 years after a sepsis diagnosis (Liu Z. et al., 2019). In our study, sepsis was not found to act as a genetic predisposing factor for LC.
Considering the disparities in previous research together with the findings of this study, we posit that neither cancer nor sepsis is a single homogeneous disease; for example, the risk associated with breast cancer differs from that of pancreatic cancer. Biological distinctions also exist between solid malignancies and malignancies of the hematopoietic system. Consequently, treating all cancer or sepsis patients as a uniform group is inappropriate, and treatments chosen on that basis are likely to yield suboptimal outcomes. Sepsis, like cancer, is highly heterogeneous. Gaining a deeper understanding of the distinctive physiological states induced by sepsis and cancer is complex but crucial.
Observational studies often face limitations due to unobserved confounding factors and reverse causality, focusing more on correlation than causation. While data indicate a connection, the causal link between sepsis and LC has not yet been conclusively proven. We used MR analysis to investigate the genetic underpinnings of the causal link between sepsis and LC and to counteract biases and confounders. Our findings indicate that infection and inflammatory markers should be monitored in patients with LC, and that prevention and prompt treatment of sepsis deserve particular attention in this group.
Further MVMR analyses underscored the significance of BMI, educational attainment, T2DM, and data on cigarettes per day.
Firstly, high BMI, especially obesity, may increase the risk of infection because obesity may affect the function of the immune system and may be associated with chronic inflammation. An MR study also showed that obesity was linked to a heightened likelihood of developing sepsis (Hu et al., 2023). Secondly, educational attainment may indirectly affect an individual's risk of sepsis by influencing their lifestyle, health behaviors, access to medical resources, and the environment in which they live and work. For example, previous observational research has shown that lower educational attainment (EA) levels are connected to a heightened likelihood of COVID-19 (Jian et al., 2021). Thirdly, individuals with diabetes have a higher propensity to develop wounds and ulcers that do not heal and can become infected, resulting in sepsis. In addition, diabetes alters the immune system, leading to an increased risk of sepsis (Schuetz et al., 2011). Lastly, smoking may elevate infection risk by increasing proinflammatory cytokines and damaging endothelial cells, and it correlates with poor health habits (Alroumi et al., 2018; Zhang et al., 2022). A causal link between smoking and infectious disease risk was also shown in an MR study (Zhu et al., 2023). Taken together, these findings point to a comprehensive strategy for managing sepsis that considers these factors in combination.
Our research has several advantages. This MR study represents the first exploration of the causal link between sepsis and LC at the genetic level. All the SNPs selected as instrumental variables (IVs) originated from populations of European ancestry, thus diminishing the probability of population stratification bias and bolstering the credibility of the bidirectional MR analysis. The strong instruments used in this research (with F-statistics well above 10) should mitigate potential bias from sample overlap. Nonetheless, our investigation has its limitations. Several initial exposures were sourced from the UKB cohort, and the absence of additional GWAS hindered the execution of a confirmatory control analysis. Additionally, the exclusive access to summary-level GWAS data impeded the conduct of more detailed subgroup analyses.
In conclusion, our study used a comprehensive approach to investigate the association between lung cancer and sepsis, providing novel insights. Our results suggest that LC is a significant risk factor for sepsis; however, sepsis was not found to act as a genetic predisposing factor for LC. Further in-depth research is warranted to unravel the additional intricacies of this relationship. Our findings underscore the need for integrated sepsis management in LC patients.

Conclusion
To summarize, our study supports a causal relationship between LC and increased risk of sepsis, with no evidence for a reverse association. Comprehensive prevention and treatment of sepsis should be carried out in LC patients, especially those with high BMI, low educational attainment, or T2DM, and those who smoke.

FIGURE 1
Overview of research design and analysis strategy. The MR framework is based on three fundamental MR assumptions. IVs, instrumental variables; MR, Mendelian randomization; MR-PRESSO, MR Pleiotropy Residual Sum and Outlier; SNP, single nucleotide polymorphism.

TABLE 1
Detailed information of data sources.